Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines
Abstract
This paper presents a pretraining algorithm for a deep Boltzmann machine (DBM): a greedy, layer-by-layer learning algorithm. Considering the deep-belief-net type of pretraining for the DBM, which is a simplified version of the original DBM pretraining, yields two theoretical results. (1) Pretraining can be derived from a variational approximation of the original maximum likelihood estimation by applying two different approximations to two different parts of the true log-likelihood function of the DBM: a replacing approximation that uses a Bayesian network, and a Bethe-type approximation based on the cluster variation method. (2) Pretraining is guaranteed to improve the variational bound of the true log-likelihood function of the DBM. These two theoretical results will help deepen our understanding of deep learning. Moreover, on the basis of these results, we discuss the original pretraining of the DBM in the latter part of this paper.
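As a rough illustration of what greedy layer-by-layer pretraining of the deep-belief-net type can look like, the sketch below stacks binary restricted Boltzmann machines, each trained on the hidden-unit probabilities of the layer beneath it. The abstract does not specify how each layer is trained, so the use of one-step contrastive divergence (CD-1), the function names `train_rbm` and `greedy_pretrain`, the hyperparameters, and the toy data are all assumptions made for illustration. The sketch does not attempt to reproduce the original DBM pretraining or the paper's derivation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=20, lr=0.1, rng=None):
    """Train one binary RBM with CD-1 (an assumed training rule)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b = np.zeros(n_visible)  # visible biases
    c = np.zeros(n_hidden)   # hidden biases
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        ph_data = sigmoid(data @ W + c)
        h_sample = (rng.random(ph_data.shape) < ph_data).astype(float)
        # Negative phase: one Gibbs step, using probabilities (not samples)
        # for the reconstructed visible units.
        pv_model = sigmoid(h_sample @ W.T + b)
        ph_model = sigmoid(pv_model @ W + c)
        # Approximate likelihood-gradient updates, averaged over the batch.
        W += lr * (data.T @ ph_data - pv_model.T @ ph_model) / n_samples
        b += lr * (data - pv_model).mean(axis=0)
        c += lr * (ph_data - ph_model).mean(axis=0)
    return W, b, c

def greedy_pretrain(data, layer_sizes, **kwargs):
    """Stack RBMs bottom-up; each layer sees the previous layer's hidden probabilities."""
    params, layer_input = [], data
    for n_hidden in layer_sizes:
        W, b, c = train_rbm(layer_input, n_hidden, **kwargs)
        params.append((W, b, c))
        layer_input = sigmoid(layer_input @ W + c)
    return params

# Toy usage on random binary data (purely illustrative).
rng = np.random.default_rng(0)
toy_data = (rng.random((64, 6)) < 0.5).astype(float)
params = greedy_pretrain(toy_data, layer_sizes=[4, 3], epochs=5, rng=rng)
```

The returned `params` list holds one `(W, b, c)` tuple per layer. In practice such weights would serve only as an initialization for subsequent joint training of the full model.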
Similar resources
A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines
A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that it is difficult to train a DBM with approximate maximum-likelihood learning using the stochastic gradient, unlike its simpler special case, the restricted Boltzmann machine (RBM). In this paper, we propose a novel pretraining algorithm that con...
How to Pretrain Deep Boltzmann Machines in Two Stages
A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that it is difficult to train a DBM with approximate maximum-likelihood learning using the stochastic gradient, unlike its simpler special case, the restricted Boltzmann machine (RBM). In this paper, we propose a novel pretraining algorithm that con...
A Better Way to Pretrain Deep Boltzmann Machines
We describe how the pretraining algorithm for Deep Boltzmann Machines (DBMs) is related to the pretraining algorithm for Deep Belief Networks and we show that under certain conditions, the pretraining procedure improves the variational lower bound of a two-hidden-layer DBM. Based on this analysis, we develop a different method of pretraining DBMs that distributes the modelling work more evenly ...
Inductive Principles for Learning Restricted Boltzmann Machines (DRAFT: August 25, 2010)
We explore the training and usage of the Restricted Boltzmann Machine for unsupervised feature extraction. We investigate the many different aspects involved in its training, and by applying the concept of iterate averaging we show that it is possible to greatly improve on state-of-the-art algorithms. We also derive estimators based on the principles of pseudo-likelihood, ratio matching, and ...
Boltzmann Machine Learning with the Latent Maximum Entropy Principle
We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes' maximum entropy principle and from standard maximum likelihood estimation. We demonstrate the LME principle by deriving new algorithms for Boltzmann machine parameter estimation, and show h...